Analysis plan

Author

Lucija Batinović

Synthesis plan

We will conduct an individual participant data meta-analysis by fitting a Bayesian beta-binomial hierarchical regression. Our effect size will be presented as an odds ratio between the baseline and intervention phases, which captures the trend and variability aspects of SCED analysis according to WWC guidelines. The analyses will be performed in R (R Core Team, 2023; R version 4.3.1), using the brms package (Bürkner, 2017), an R interface to the probabilistic programming language Stan (Stan Development Team, 2024).
We will conduct separate analyses for each category of reading intervention (phonological awareness, fluency, and phonics), as we do not consider them conceptually homogeneous enough to aggregate in one model. Likewise, we will group the analyses based on designs that can be synthesized in the same model. With four design groups and three interventions (phonological awareness, fluency, and phonics), we anticipate at most 12 main meta-analytic models, provided we find at least two eligible studies per category. Table 2 illustrates the possible combinations and the designs we plan to synthesise together.

Table 2
Study design and intervention matrix

Major design groups, their subdesigns, and intervention categories:

1. AB and Multiple Baseline Designs
   Subdesigns: AB Design; Multiple Baseline across participants; Multiple Baseline across settings/situations; Multiple Baseline across time; Multiple Probe Design
   Intervention categories: A) Phonological Awareness; B) Fluency; C) Phonics

2. Changing Criterion Designs
   Subdesigns: Changing Criterion (CC) Design
   Intervention categories: A) Phonological Awareness; B) Fluency; C) Phonics

3. Multiple-Treatment Designs
   Subdesigns: Multi-element Design; Alternating-Treatment Design (ATD); Adapted Alternating-Treatment Design; Simultaneous-Treatment Design
   Intervention categories: A) Phonological Awareness; B) Fluency; C) Phonics

4. Reversal (ABAB) Designs
   Subdesigns: ABAB (Reversal/Withdrawal) Design
   Intervention categories: A) Phonological Awareness; B) Fluency; C) Phonics

Combination Designs / Other: no subdesigns; not applicable (excluded from synthesis)

In the first group, we will aggregate multiple baseline designs (and their variants), and multiple probe designs. We consider these designs similar in logic, and as long as the intervention and outcome are kept homogeneous, we can model the variations in measurement and intervention setup across stages as random variation.
In the second group, we will synthesise changing-criterion (CC) designs. We will model CC designs with the same formula as multiple baseline/AB designs, but in a separate model. This follows the analytical approach suggested by W. R. Shadish, Kyse, and Rindskopf (2013), which regards all criterion phases as the intervention phase and compares them to the baseline (ignoring the stabilizing phases between criterion phases). Although modelling the CC design this way makes it functionally comparable to the multiple baseline/AB design setup, the staggered nature of CC designs might artificially flatten the slope compared to AB or multiple baseline designs, where mastery of the final desired outcome may come immediately, or at least early in the intervention phase.
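As a minimal sketch of this recoding (with hypothetical phase labels), all criterion phases collapse into a single intervention phase that is compared against the baseline:

```r
# Hypothetical changing-criterion series: every criterion phase is recoded
# as "Intervention" and compared against the baseline phase
phase <- c("Baseline", "Baseline", "Criterion1", "Criterion2", "Criterion3")
ID <- ifelse(phase == "Baseline", "Baseline", "Intervention")
table(ID)   # Baseline: 2, Intervention: 3
```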

The third group will consist of comparative study designs (Ledford & Gast, 2018). For studies using ATD, we will follow a structured extraction workflow. First, we will verify that a baseline phase is present, either as a pre-intervention phase or as a comparison phase without treatment. If no baseline is available, the study will be excluded from extraction. If a baseline is present, we will assess the treatment phases: if the treatments do not both belong to the same intervention category, the study will not be included in the analysis. Studies without a baseline/no-treatment condition, or studies comparing different intervention categories, would not answer our research question, which does not focus on the relative effectiveness of interventions. The baseline data will be explicitly modeled as “Baseline,” and the superior treatment data will be modeled as “Intervention,” extracting all available points. In all cases, data extraction will be done sequentially to maintain the chronological order of sessions in the dataset. These analyses follow modelling alternatives suggested by Manolov, Onghena, and Noortgate (2023) and Moeyaert et al. (2015), based on decision trees driven by the research question.

If we happen to find alternating/reversal phase designs (i.e., ABAB designs), we will aggregate them in the fourth group, although due to the nature of the intervention (learning effects), we do not expect to find studies with these designs.

We will not model interventions categorized as “other,” as we expect that category to be overly heterogeneous.

We will model the baseline and intervention phases separately in our main meta-analysis model, with an interaction with time (session). We will not model autocorrelation in the main model: although the literature suggests moderate autocorrelation exists, particularly in multiple-baseline designs (Shadish & Sullivan, 2011), no reliable estimates are available for our target data, so we could model it only in an exploratory analysis.

To answer our research question, we will provide posterior distribution plots of the average intervention effect (fixed effect) and the interaction effect between session and intervention. Furthermore, we will plot the posterior distributions of the random-effects estimates (SD) to assess heterogeneity of intervention effects between studies, and between participants within studies. We will provide estimates with 90% credible intervals for each effect, allowing readers to understand both the average intervention impact and the variability of effects across different studies and participants. As estimates will be presented on a log-odds scale, we will supplement them with converted odds ratios for interpretability.
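For example, a hypothetical posterior summary on the log-odds scale (illustrative values only, not results) converts to odds ratios by exponentiation:

```r
# Hypothetical posterior summary on the log-odds scale (illustrative values)
beta_median <- 1.2           # posterior median of the intervention effect
ci_90 <- c(0.4, 2.0)         # 90% credible interval bounds

exp(beta_median)             # odds ratio, approx. 3.32
exp(ci_90)                   # interval endpoints transform directly
```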

Level one of the multilevel model describes the relationship between the outcome (word or non-word reading) and the predictor (reading intervention). Informed by previous reviews with similar interests, we assume the majority of studies will employ non-standardized word-reading tests, with counts and percentages as measures, and fixed word lists. We assume the existence of autocorrelation (though we will not model it explicitly) and knowledge transfer; therefore, we cannot assume the trials are independent. To allow for greater flexibility, to account for over-dispersion (i.e., variance larger than the mean), and to accommodate instances where studies report only proportions of correct responses rather than raw counts, we will model the outcome under the beta-binomial distribution (Gelman et al., 2020). Figure 1 illustrates the multiple levels in a meta-analysis.
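A small base-R simulation (with illustrative parameter values) shows how the beta-binomial inflates the variance relative to a plain binomial with the same mean:

```r
set.seed(42)
N <- 10; mu <- 0.6; theta <- 5                   # illustrative values only
# Beta-binomial: success probabilities vary across observations per Beta(mu*theta, (1-mu)*theta)
p <- rbeta(1e5, mu * theta, (1 - mu) * theta)
y <- rbinom(1e5, size = N, prob = p)             # beta-binomial draws
var(y)                                           # exceeds the binomial variance
N * mu * (1 - mu)                                # plain binomial variance: 2.4
```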

To calculate the effect for our meta-analysis, we need to account for participant and study clusters. There are two ways to approach the raw data: calculate average effects per study and meta-analyse those effects (the common approach when raw data are unavailable), or build a model that estimates the population (average) effect directly from the raw data. We will take the latter, one-stage approach.

Figure 1 Levels of the hierarchical model in SCEDs

Thus, our starting model is as follows:

\[ Y_{ijk} \sim BetaBin(N_{ijk}, \mu_{ijk}, \theta) \]

The beta-binomial distribution can be understood as a binomial distribution whose probability parameter is drawn from a beta distribution. \(Y_{ijk}\) represents the number of correct responses for individual j within study k at time point i = 1, …, n. \(BetaBin\) indicates the observations are drawn from a beta-binomial distribution with \(N_{ijk}\) as the number of trials, \(\mu_{ijk}\) as the average probability of the beta distribution (some authors denote it as pbar; McElreath, 2020), and \(\theta\) as the dispersion (precision) parameter of the beta distribution (some authors denote it as \(\phi\); Bürkner, 2017).
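Assuming the usual mean-precision parameterisation (as in brms), the first two moments make the over-dispersion explicit:

\[ E(Y_{ijk}) = N_{ijk}\,\mu_{ijk} \]

\[ Var(Y_{ijk}) = N_{ijk}\,\mu_{ijk}(1-\mu_{ijk})\,\frac{\theta + N_{ijk}}{\theta + 1} \]

For \(N_{ijk} > 1\) the variance exceeds the binomial variance \(N_{ijk}\,\mu_{ijk}(1-\mu_{ijk})\), and shrinks toward it as \(\theta \to \infty\).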

\[ \text{logit}(\mu_{ijk}) = \alpha + \beta_1 ID_{j} + \beta_2 \text{session}_{i} + \beta_3 ID_j \times \text{session}_{i} + g_{jk} + g_k \]

Here, \(\alpha\) is the global intercept, \(\beta_1\) represents the main effect of ID (i.e., phase), \(\beta_2\) represents the main effect of session (i.e., measurement occasions), \(\beta_3\) represents the interaction of session and phase, \(g_{jk}\) is the participant-level random effect nested within the study, and \(g_k\) is the study-level random effect. The ID variable stands for phase. In the case of AB, multiple baseline, and changing criterion designs, phase will represent baseline and intervention. In eligible alternating treatment designs, it will represent baseline, intervention 1, and intervention 2.
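To make the phase coding concrete, a minimal base-R sketch (the factor levels are assumptions, chosen to mirror the coefficient name IDIntervention used in the prior specifications below):

```r
# Dummy coding of the phase factor, as produced by R's model matrix machinery
ID <- factor(c("Baseline", "Baseline", "Intervention", "Intervention"))
X <- model.matrix(~ ID)
colnames(X)              # "(Intercept)" "IDIntervention"
X[, "IDIntervention"]    # 0 0 1 1: switches on during the intervention phase
```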

The multilevel model now specifies random effects on the participant (\(g_{jk}\)) and study (\(g_{k}\)) levels. We also define parameters for priors for the means and standard deviations. As these effects are interdependent, the model also produces a correlation matrix to explain the covariance. Weakly informative priors drawn from a Student-t or a Cauchy distribution have been recommended by Gelman (2006) for hierarchical models, particularly when the number of groups is larger than 3, which we expect in our meta-analyses. Suggested informative priors come from Grekov, Pustejovsky, and Klingbeil (2025). However, their specified priors fit the baseline phase only, while we also include the intervention phase. Nevertheless, their data are highly similar to our target data, as they modelled reading fluency outcomes from single-case designs.

Furthermore, we will incorporate immediate level changes and time trends through session effects and their interaction with the phases. This allows us to capture both the immediate difference and gradual changes happening during the intervention/baseline phase. The model includes random slopes for both session and intervention × session effects at both study and case levels, enabling us to detect heterogeneity in learning trajectories across studies and participants.

LKJ, the Lewandowski-Kurowicka-Joe distribution, is a probability distribution over positive-definite symmetric matrices with unit diagonals, i.e., correlation matrices (Gelman et al., 2020). We use this distribution as a prior for the correlation matrix of effects in the model and set it as \(LKJcorr(2)\), which makes extreme correlations less likely but still allows for correlations between the effects (McElreath, 2020).
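For reference, the LKJ density over a correlation matrix \(\mathbf{R}\) is, up to a normalising constant,

\[ p(\mathbf{R} \mid \eta) \propto \det(\mathbf{R})^{\eta - 1} \]

so \(\eta = 2\) places less mass on matrices with correlations near \(\pm 1\), while \(\eta = 1\) would be uniform over correlation matrices.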

We draw standard deviations from half-Student-t distributions instead of normal distributions, as their heavier tails allow for extreme values, which helps us accommodate the uncertainty about these parameters. Defining the distribution as “half” constrains it to positive values, as required for an SD.
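A quick base-R sketch of this construction: a half-Student-t draw is the absolute value of a scaled t draw, so every value is non-negative while the heavy tails remain.

```r
set.seed(123)
# Half-Student-t(3, 0, 2.5): absolute value of a t-distribution with 3 df, scaled by 2.5
sd_draws <- abs(2.5 * rt(10000, df = 3))
all(sd_draws >= 0)    # TRUE: constrained to positive values
mean(sd_draws > 10)   # small but nonzero: heavy tails still permit extreme SDs
```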

Given that the designs bound the possible responses to a small range, and that interventions are often conducted until a 100% success rate is achieved, we can expect ceiling effects, large effects in the intervention phase, and larger improvements (steeper slopes) for participants with lower baseline results.

The hierarchical structure of the random effects follows:

\[ g_{jk} \sim \mathcal{N}(0, \sigma_{j[k]}) \]

\[ g_k \sim \mathcal{N}(0, \sigma_{k}) \]

where \(g_{jk}\) represents variability among cases within studies, \(g_{k}\) stands for variability across studies, \(\sigma_{j[k]}\) and \(\sigma_{k}\) are the standard deviations at each level.

The variance components follow a hierarchical prior:

\[ \sigma_{k} \sim \text{Half-Student-t}(3, 0, 2.5) \]

\[ \sigma_{j[k]} \sim \text{Half-Student-t}(3, 0, 2.5) \]

\[ \mathbf{R} \sim \text{LKJcorr}(2) \]

\[ \alpha \sim \text{Student-t}(3, 3.9, 2.5) \]

\[ \beta_1, \beta_2, \beta_3 \sim \text{Student-t}(3, 0, 2.5) \]

\[ \theta \sim \text{Gamma}(1, 1) \]

Moderator analysis

We will conduct a moderator analysis based on IQ, as findings by Allor et al. (2014) suggest that lower IQ is linked to lower rates of reading development.
We will introduce IQ as a continuous predictor in the main model, centered around the group mean. While grand-mean centering might be appropriate given that it is one population of participants, in this case group-mean centering is preferable due to the likely non-random selection of intellectual disability severity within studies (Kreft and Aiken, 1995; Enders and Tofighi, 2007). Specifically, the selection of participants with a particular severity level (e.g., mild intellectual disability) may influence the intervention complexity. If IQ were centered around the grand mean, this study-level selection bias could distort estimates, making group-mean centering the more appropriate choice to account for within-study variability.

\[ \text{logit}(\mu_{ijk}) = \alpha + \beta_1 ID_{j} + \beta_2 \text{session}_{i} + \beta_3 ID_j \times \text{session}_{i} + g_{jk} + g_k + \gamma_{IQ[j]} \]

\[ \gamma_{IQ[j]} \sim \mathcal{N}(\mu_{IQ[j]}, \sigma_{IQ[j]}) \]
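A minimal base-R sketch of group-mean centering (hypothetical data): each participant's IQ is expressed as a deviation from their own study's mean, so study-level differences in mean IQ no longer enter the predictor.

```r
# Hypothetical data: group-mean (within-study) centering of IQ
df <- data.frame(study = c("A", "A", "B", "B"),
                 IQ    = c(55, 65, 70, 80))
df$IQ_centered <- df$IQ - ave(df$IQ, df$study)  # subtract each study's mean IQ
df$IQ_centered   # -5  5 -5  5: within-study deviations, study means removed
```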

We will include IQ as a continuous variable rather than using categorical intellectual disability severity, as a continuous measure provides more granular and informative modeling of variation. Additionally, there are potentially two types of missing values:

Missing at random (MAR) — where IQ scores are either not reported or not permitted for sharing.
Missing not at random (MNAR) — where IQ scores are unavailable because participants could not be reliably tested due to the severity of their disability.

If the data are missing not at random, we will not conduct a moderator analysis. In the case of MAR-type missingness, we will impute the values with the mice package (van Buuren and Groothuis-Oudshoorn, 2011).

Model building workflow

We have pre-specified possible alternatives for the models, based on assumptions about the outcome data and intervention characteristics. However, as we do not have data at the moment of specifying the models, we propose a workflow to determine which model will be selected as the main model once we obtain the data.

We will graphically inspect the data to estimate the growth curve based on sessions, to determine whether time influences the outcome linearly or exponentially.

While there is a possibility that none of the models will provide an adequate fit, this appears unlikely based on pilot analyses of similar interventions. If none of the models are satisfactory based on the criteria followed in the workflow, we will report all models in the main analysis and interpret the results with appropriate caution, explicitly discussing the limitations.  Additionally, we will perform exploratory analyses to better understand the data.

Table 3 presents alternate priors and functions of the predictors that we will apply to the data and test the model fit. We will select the best model for each meta-analysis as the main model and provide results of the other model alternatives as a sensitivity analysis.

Table 3
Model Specifications and Priors

Model Session Specification Priors Notes
1A Linear (session) Default brms priors Baseline model
1B Linear (session) - Student_t(3, 0, 2.5) for IDIntervention slope
- Normal(0, 0.1) for session slope
- Student_t(3, 3.9, 2.5) for intercept
- LKJ(2) for correlations
- Gamma(1, 1) for phi
- Student_t(3, 0, 2.5) for all group-level SDs
Weakly informative priors based on literature; narrow prior on session slope
2A Splined (session) Default brms priors Non-linear learning trajectory using splines
2B Splined (session) - Student_t(3, 0, 2.5) for IDIntervention
- Normal(0, 0.1) for spline basis coefficients (e.g., ssession_1, etc.)
- Student_t(3, 3.9, 2.5) for intercept
- LKJ(2) for correlations
- Gamma(1, 1) for phi
- Student_t(3, 0, 2.5) for group-level SDs
- Student_t(3, 0, 2.5) for spline smoothness (sds)
Informed priors; compatible with prior predictive sampling

Initial setup of the models and priors

Data setup
# data will be cleaned here for analyses. We will filter out maintenance and generalization phases for the analysis. We will also filter data based on SCED design and run all models for each category of designs. 
data <- read_csv(here("analysis", "mydata.csv")) %>% 
  mutate(success_rate = round(success_rate*100, digits = 0),
         fail_rate = round(abs(fail_rate)*100, digits = 0)) %>% 
  filter(ID != "Maintenance") %>% # remove unnecessary rows that will not be modeled
  # filter(intervention == "multiple baseline") %>%  # filter specific SCED designs to model together
  group_by(study, case, ID) %>%  
  mutate(session = row_number()) %>%  
  ungroup() %>%
  mutate(ID = as.factor(ID))

# data %>% filter(ID == "Intervention") %>%          # factor level matches the "IDIntervention" coefficient
#   ggplot(aes(x = session, y = success_rate, color = case)) +
#   geom_point() +
#   geom_line() +
#   facet_wrap(~study)

Multiple baseline/probe and AB designs; changing-criterion designs; alternating treatment design

Code for the prior and formula setup for MB/AB
# regular session formula
model_formula <- bf(success_rate|trials(success_rate + fail_rate) ~ session + ID + session:ID + (1 + session + ID | study/case))

# splined session formula
model_formula2 <- bf(success_rate|trials(success_rate + fail_rate) ~ s(session) + ID + s(session, by = ID) + (1 + session + ID | study/case)) 

## priors for default models (linear and splined)
priors_model_formula <- get_prior(model_formula, data = data, family = beta_binomial())  # linear + default priors
priors_model_formula$prior[1] <- "normal(0,100)" # for prior predictive check, remove after data collection
priors_model_formula2 <- get_prior(model_formula2, data = data, family = beta_binomial())  # sigmoid + default priors
priors_model_formula2$prior[1] <- "normal(0,100)" # for prior predictive check, remove after data collection

## Specifying informative priors for linear model (1B)
priors_model_formula3 <- c(
  set_prior("student_t(3, 0, 2.5)", class = "b", coef = "IDIntervention"),   # Fixed effect of intervention
  set_prior("normal(0,0.1)", class = "b", coef = "session"),                 # narrow, weakly regularizing prior for the session slope
  set_prior("lkj(2)", class = "cor"),                                        # Prior for correlation matrices
  set_prior("student_t(3, 3.9, 2.5)", class = "Intercept"),                  # Intercept (baseline phase)
  set_prior("gamma(1, 1)", class = "phi", lb = 0),                           # Phi (precision)
  
  # Random effects priors
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study:case"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study", coef = "IDIntervention"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study", coef = "Intercept"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study:case", coef = "IDIntervention"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study:case", coef = "Intercept")
)

## Specifying informative priors for splined model (2B)
priors_model_formula4 <- c(
  # Fixed effects
  set_prior("student_t(3, 0, 2.5)", class = "b", coef = "IDIntervention"),
  set_prior("normal(0,0.1)", class = "b", coef = "ssession_1"),  # narrow prior for the spline basis coefficient
  set_prior("lkj(2)", class = "cor"), # Correlations
  set_prior("student_t(3, 3.9, 2.5)", class = "Intercept"),   # Intercept baseline
  set_prior("gamma(1, 1)", class = "phi", lb = 0),
  # Random effect SDs
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study:case"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study", coef = "IDIntervention"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study", coef = "Intercept"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study:case", coef = "IDIntervention"),
  set_prior("student_t(3, 0, 2.5)", class = "sd", group = "study:case", coef = "Intercept"),
  # Prior for spline smooth term 
  set_prior("student_t(3, 0, 2.5)", class = "sds", coef = "s(session)")
)

Code for the models

# model 1A regular session default priors
fit1 <- brm(formula = model_formula,
            data = data,
            family = beta_binomial(link = "logit"),
            prior = priors_model_formula,
            warmup = 2000, iter = 4000,
            control = list(adapt_delta = 0.99), # raise adapt_delta to reduce divergent transitions
            cores = parallel::detectCores(),
            init_r = 0.1
            )

# model 2A splined session default priors
fit2 <- brm(formula = model_formula2,
            data = data,
            family = beta_binomial(link = "logit"),
            prior = priors_model_formula2,
            warmup = 2000, iter = 4000,
            control = list(adapt_delta = 0.99), # raise adapt_delta to reduce divergent transitions
            cores = parallel::detectCores(),
            init_r = 0.1
            )
Code for the models
# model 1B regular session informed priors
fit3 <- brm(formula = model_formula,
            data = data,
            family = beta_binomial(link = "logit"),
            prior = priors_model_formula3,
            warmup = 2000, iter = 4000,
            control = list(adapt_delta = 0.99), # raise adapt_delta to reduce divergent transitions
            cores = parallel::detectCores(),
            init_r = 0.1
            )
Code for the models
# model 2B splined session informed priors
fit4 <- brm(formula = model_formula2,
            data = data,
            family = beta_binomial(link = "logit"),
            prior = priors_model_formula4,
            warmup = 2000, iter = 4000,
            control = list(adapt_delta = 0.99), # raise adapt_delta to reduce divergent transitions
            cores = parallel::detectCores(),
            init_r = 0.1
            )

Step 1 - Prior Predictive Checks

Figure 2 shows prior predictive checks for the four proposed models, based on different session functions and priors. The draws are sampled from an example dataset of single-case designs with a proportion outcome, but from a different subject area, for the purpose of building and testing the models before real data are collected.

Prior pred check code
color_scheme_set("red")
## models are fit with sample_prior = "only" to sample from the priors without conditioning on the observed data
pp_fit1 <- brm(formula = model_formula, family = beta_binomial(link = "logit"), prior = priors_model_formula,
               sample_prior = "only", data = data)

Prior pred check code
# model 2A splined session default priors
pp_fit2 <- brm(formula = model_formula2, family = beta_binomial(link = "logit"), prior = priors_model_formula2, 
               sample_prior = "only", data = data)

Prior pred check code
# model 1B regular session informed priors
pp_fit3 <- brm(formula = model_formula, family = beta_binomial(link = "logit"), prior = priors_model_formula3, 
               sample_prior = "only", data = data)

Prior pred check code
# model 2B sigmoid session informed priors
pp_fit4 <- brm(formula = model_formula2, family = beta_binomial(link = "logit"), prior = priors_model_formula4, 
               sample_prior = "only", data = data)

Prior pred check code
pp1 <- pp_check(pp_fit1, prefix = "ppd", ndraws = 100) + theme(legend.position = "none") + ggtitle("Fit1 - linear + default")
  
pp2 <- pp_check(pp_fit2, prefix = "ppd", ndraws = 100) + theme(legend.position = "none") + ggtitle("Fit2 - linear + informed")
  
pp3 <- pp_check(pp_fit3, prefix = "ppd", ndraws = 100) + theme(legend.position = "none") + ggtitle("Fit3 - GAM + default")
  
pp4 <- pp_check(pp_fit4, prefix = "ppd", ndraws = 100) + theme(legend.position = "none") + ggtitle("Fit4 - GAM + informed")

pred_plots <- list(pp1, pp2, pp3, pp4)

patchwork::wrap_plots(pred_plots, ncol = 2, nrow = 2)

Figure 2 Prior predictive checks for the four candidate models

Step 2 - Posterior Predictive Checks

Figure 3 shows posterior predictive checks for the four candidate models, which differ in the functional form of the session effect and in their priors; unlike the prior predictive checks, these are computed from the observed data after collection.
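The mechanics behind these checks can be sketched in base R: draw replicated datasets from the beta-binomial likelihood using draws of (mu, phi), then compare their distribution to the observed counts. The draws below are simulated stand-ins, not the fitted posteriors, and the trial size of 20 is a placeholder.

```r
set.seed(1)
# beta-binomial as a beta mixture of binomials, in the brms (mu, phi)
# parameterisation: shape1 = mu * phi, shape2 = (1 - mu) * phi
rbetabinom <- function(n, size, mu, phi) {
  p <- rbeta(n, shape1 = mu * phi, shape2 = (1 - mu) * phi)
  rbinom(n, size = size, prob = p)
}
# hypothetical posterior draws of the mean (probability scale) and precision
mu_draws  <- plogis(rnorm(100, mean = 0.5, sd = 0.3))
phi_draws <- rgamma(100, shape = 5, rate = 1)
# one replicated dataset of 50 observations per posterior draw
y_rep <- sapply(seq_along(mu_draws),
                function(i) rbetabinom(50, size = 20, mu_draws[i], phi_draws[i]))
dim(y_rep)  # 50 observations x 100 replicated datasets
```

A density overlay of the columns of `y_rep` against the observed counts is essentially what `pp_check(type = "dens_overlay_grouped")` plots per participant.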

Posterior pred check code
color_scheme_set("blue")
ppc1 <- pp_check(fit1, type = "dens_overlay_grouped", group = "ID", ndraws = 100) + theme(legend.position = "none") + ggtitle("Fit1 - linear + default")
  
ppc2 <- pp_check(fit2, type = "dens_overlay_grouped", group = "ID", ndraws = 100) + theme(legend.position = "none") + ggtitle("Fit2 - linear + informed")
  
ppc3 <- pp_check(fit3, type = "dens_overlay_grouped", group = "ID", ndraws = 100) + theme(legend.position = "none") + ggtitle("Fit3 - GAM + default")
  
ppc4 <- pp_check(fit4, type = "dens_overlay_grouped", group = "ID", ndraws = 100) + theme(legend.position = "none") + ggtitle("Fit4 - GAM + informed")

post_pred_plots <- list(ppc1, ppc2, ppc3, ppc4)

patchwork::wrap_plots(post_pred_plots, ncol = 2, nrow = 2)

Figure 3 Posterior predictive checks for the four candidate models

Step 3 - Model Comparisons (LOO and WAIC)

We will conduct leave-one-out (LOO) cross-validation and compute the asymptotically equivalent widely applicable (Watanabe-Akaike) information criterion (WAIC) to select the best-fitting model for the meta-analysis. Following Gelman, Hwang, and Vehtari (2014), LOO and WAIC are more appropriate than alternative criteria for partial or complete pooling models. We will interpret the WAIC and LOO scores and select the model with the lowest estimated out-of-sample deviance as the best-fitting model.

Model comparison code
fit1_ic <- add_criterion(fit1, criterion = "loo")
fit1_waic <- waic(fit1)
fit2_ic <- add_criterion(fit2, criterion = "loo")
fit2_waic <- waic(fit2)
fit3_ic <- add_criterion(fit3, criterion = "loo")
fit3_waic <- waic(fit3)
fit4_ic <- add_criterion(fit4, criterion = "loo")
fit4_waic <- waic(fit4)

w <- loo_compare(fit1_waic, fit2_waic, fit3_waic, fit4_waic) 
print(w, simplify = FALSE)
     elpd_diff se_diff elpd_waic se_elpd_waic p_waic  se_p_waic waic    se_waic
fit2     0.0       0.0 -1117.2      28.2         35.6     3.6    2234.4    56.4
fit4    -0.4       0.8 -1117.6      27.9         34.0     3.5    2235.1    55.7
fit1   -13.4       5.3 -1130.6      29.0         27.0     3.0    2261.2    57.9
fit3   -13.8       5.1 -1131.0      28.6         26.0     2.9    2262.0    57.3
Model comparison code
l <- loo_compare(fit1_ic, fit2_ic, fit3_ic, fit4_ic) # leave-one-out cross-validation
print(l, simplify = FALSE)
        elpd_diff se_diff elpd_loo se_elpd_loo p_loo   se_p_loo looic  
fit2_ic     0.0       0.0 -1117.8     28.2        36.3     3.6   2235.7
fit4_ic    -0.4       0.7 -1118.2     27.9        34.6     3.6   2236.5
fit1_ic   -13.0       5.3 -1130.9     29.0        27.3     3.0   2261.8
fit3_ic   -13.5       5.2 -1131.3     28.6        26.3     3.0   2262.6
        se_looic
fit2_ic    56.4 
fit4_ic    55.8 
fit1_ic    57.9 
fit3_ic    57.3 
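A rough heuristic for reading the comparison tables above (not a formal test) is to judge each elpd difference against its standard error. The numbers below are copied from the `loo_compare()` output:

```r
# elpd differences relative to the best model (fit2), from the LOO table
elpd_diff <- c(fit4 = -0.4, fit1 = -13.0, fit3 = -13.5)
se_diff   <- c(fit4 = 0.7,  fit1 = 5.3,   fit3 = 5.2)
z <- elpd_diff / se_diff  # difference from fit2 in SE units
round(z, 2)
# fit4 is well within one SE of fit2, so the two are practically
# indistinguishable; fit1 and fit3 are more than two SEs worse
```

This is why fit2 and fit4 (informed priors) are clearly preferred over their default-prior counterparts, while the choice between fit2 and fit4 is less decisive.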

Step 4 - Moderator Analysis

Moderator analysis code
# we will center IQ with group mean and model with random slopes
imp <- mice(data = data, m = 10, print = FALSE) # multiply impute missing data (m = 10 datasets)
long_imp <- complete(imp, action = 'long', include = TRUE)
long_imp <- long_imp %>% mutate(IQ_gpm = ave(IQ, study), IQ_gpc = IQ - IQ_gpm)

# ensure no zero counts so the beta-binomial likelihood doesn't break
long_imp <- long_imp %>%
  rowwise() %>%
  mutate(
    success_rate = ifelse(success_rate == 0, 0.5, success_rate),
    fail_rate = ifelse(fail_rate == 0, 0.5, fail_rate),
    total = success_rate + fail_rate,
    success_rate = round(success_rate / total * 100),
    fail_rate = 100 - success_rate
  ) %>%
  ungroup()

# convert back to a mids object only after both transformations,
# so the centered and zero-corrected values reach the model
imp2 <- as.mids(long_imp)


formula_iq <- bf(success_rate|trials(success_rate + fail_rate) ~ session + ID + session:ID + IQ + (1 + session + ID + IQ | study/case))

prior_iq <- get_prior(formula = formula_iq, 
                   data = data, 
                   family = beta_binomial())

fit_iq <- brm_multiple(formula = formula_iq, 
                 data = imp2, 
                 family = beta_binomial(link = "logit"),
                 init = function() list(phi = 5),  # phi must be > 0
                 prior = prior_iq, 
                 warmup = 1000, iter = 4000,
                 cores = parallel::detectCores(),
                 control = list(adapt_delta = 0.99) # reduce divergent transitions
          )
[[1]]
Stan model 'anon_model' does not contain samples.
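The group-mean centering performed with `ave()` above can be illustrated on toy numbers (hypothetical IQ scores, not study data). Centering each case's IQ on its own study's mean means the coefficient for `IQ_gpc` reflects within-study differences rather than between-study ones (Enders & Tofighi, 2007):

```r
# toy illustration of group-mean centering with ave()
iq    <- c(85, 95, 100, 110)
study <- c("s1", "s1", "s2", "s2")
iq_gpm <- ave(iq, study)  # per-study means: 90, 90, 105, 105
iq_gpc <- iq - iq_gpm     # within-study deviations: -5, 5, -5, 5
```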
Moderator analysis code
summary(fit_iq)
 Family: beta_binomial 
  Links: mu = logit; phi = identity 
Formula: success_rate | trials(success_rate + fail_rate) ~ session + ID + session:ID + IQ + (1 + session + ID + IQ | study/case) 
   Data: imp2 (Number of observations: 333) 
  Draws: 38 chains, each with iter = 4000; warmup = 1000; thin = 1;
         total post-warmup draws = 114000

Group-Level Effects: 
~study (Number of levels: 4) 
                              Estimate Est.Error l-95% CI u-95% CI Rhat
sd(Intercept)                     1.13      1.00     0.07     3.64 1.16
sd(session)                       0.45      0.96     0.06     6.03 1.38
sd(IDIntervention)                1.94      1.23     0.17     4.81 1.24
sd(IQ)                            0.69      1.44     0.00     5.15 1.61
cor(Intercept,session)           -0.04      0.52    -0.95     0.79 1.38
cor(Intercept,IDIntervention)    -0.11      0.54    -0.96     0.94 1.52
cor(session,IDIntervention)      -0.16      0.50    -0.88     0.88 1.40
cor(Intercept,IQ)                -0.05      0.50    -0.85     0.90 1.40
cor(session,IQ)                  -0.16      0.46    -0.95     0.92 1.31
cor(IDIntervention,IQ)            0.07      0.51    -0.81     0.98 1.39
                              Bulk_ESS Tail_ESS
sd(Intercept)                      179     2889
sd(session)                         86       42
sd(IDIntervention)                 112       74
sd(IQ)                              62       40
cor(Intercept,session)              81       45
cor(Intercept,IDIntervention)       67       42
cor(session,IDIntervention)         78       50
cor(Intercept,IQ)                  100       45
cor(session,IQ)                     95       59
cor(IDIntervention,IQ)              79       48

~study:case (Number of levels: 14) 
                              Estimate Est.Error l-95% CI u-95% CI Rhat
sd(Intercept)                     0.79      1.27     0.02     6.27 1.40
sd(session)                       0.53      1.17     0.02     5.46 1.62
sd(IDIntervention)                1.56      1.34     0.18     6.73 1.83
sd(IQ)                            0.16      0.34     0.00     1.40 1.62
cor(Intercept,session)            0.06      0.50    -0.94     0.95 1.39
cor(Intercept,IDIntervention)    -0.29      0.48    -0.94     0.66 1.41
cor(session,IDIntervention)      -0.43      0.44    -0.93     0.81 1.31
cor(Intercept,IQ)                -0.09      0.57    -0.95     0.92 1.48
cor(session,IQ)                   0.03      0.49    -0.95     0.80 1.33
cor(IDIntervention,IQ)           -0.24      0.50    -0.88     0.90 1.28
                              Bulk_ESS Tail_ESS
sd(Intercept)                       97       40
sd(session)                         62       40
sd(IDIntervention)                  55       40
sd(IQ)                              62       40
cor(Intercept,session)              79       48
cor(Intercept,IDIntervention)       78       51
cor(session,IDIntervention)         91       40
cor(Intercept,IQ)                   70       44
cor(session,IQ)                     89       47
cor(IDIntervention,IQ)              99       47

Population-Level Effects: 
                       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
Intercept                  6.98     27.99   -90.96    78.37 1.84       54
session                    0.29      0.54    -0.18     1.81 1.63       61
IDIntervention             1.70      1.59    -1.86     4.41 1.48       83
IQ                        -0.22      0.62    -1.73     1.92 1.78       56
session:IDIntervention    -0.05      0.51    -1.66     0.81 1.92       52
                       Tail_ESS
Intercept                    40
session                      40
IDIntervention               45
IQ                           40
session:IDIntervention       40

Family Specific Parameters: 
    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
phi     5.70      0.80     4.29     7.18 1.53       67       44

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Moderator analysis code
# posterior predictive check for baseline and intervention fit
pp_check(fit_iq, type = "dens_overlay_grouped", group = "ID", ndraws = 100)

Moderator analysis code
plot(fit_iq)

Step 5 - Risk of Bias sensitivity analysis

RoB sensitivity analysis code
# do a sensitivity analysis based on risk of bias assessment
# data_rob <- data %>% filter(rob != "high")
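The commented line can be sketched with toy data, assuming a `rob` column holding each study's overall risk-of-bias rating (column name and levels are placeholders): the sensitivity analysis would refit the selected model after excluding high-risk studies and compare the posterior effect estimates.

```r
# hypothetical risk-of-bias ratings, one row per study
toy <- data.frame(study = c("a", "b", "c", "d"),
                  rob   = c("low", "high", "some concerns", "low"))
toy_rob <- subset(toy, rob != "high")  # drop high-risk studies
nrow(toy_rob)  # 3 studies remain
# the model would then be refit on the subset,
# e.g. update(fit, newdata = filtered_data)
```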

Step 6 - Visualization

Forest and density plots code
## forest plot of the intervention effect per study (level)
fit1 %>% 
  spread_draws(b_IDIntervention, r_study[study, IDIntervention]) %>%
  # add the grand mean to the group-specific deviations
  mutate(mu = b_IDIntervention + r_study) %>%
  ungroup() %>%
  mutate(study = str_replace_all(study, "[_]", " ")) %>% 
  # plot
  ggplot(aes(x = mu, y = reorder(study, mu))) +
  geom_vline(xintercept = fixef(fit1)[3, 1], color = "gray", linewidth = 1) + #fixed effect of IDIntervention
  geom_vline(xintercept = fixef(fit1)[3, 3:4], color = "black", linetype = 2) +
  stat_halfeye(.width = .95, size = 2/3, fill = "#BA181B", color = "#660708", alpha = 0.6) +
  labs(x = expression(italic("Log Odds Ratio (level)")),
       y = NULL) +
  theme_minimal() +
  theme(panel.grid   = element_blank(),
        axis.ticks.y = element_blank(),
        axis.text.y  = element_text(hjust = 0))

Forest and density plots code
## forest plot of the session effect per study (trend)
fit1 %>% 
  spread_draws(b_session, r_study[study, session]) %>%
  # add the grand mean to the group-specific deviations
  mutate(mu = b_session + r_study) %>%
  ungroup() %>%
  mutate(study = str_replace_all(study, "[_]", " ")) %>% 
  # plot
  ggplot(aes(x = mu, y = reorder(study, mu))) +
  geom_vline(xintercept = fixef(fit1)[2, 1], color = "gray", linewidth = 1) + #fixed effect of session
  geom_vline(xintercept = fixef(fit1)[2, 3:4], color = "black", linetype = 2) +
  stat_halfeye(.width = .95, size = 2/3, fill = "#B1A7A6", color = "#660708", alpha = 0.6) +
  labs(x = expression(italic("Log Odds Ratio (trend)")),
       y = NULL) +
  theme_minimal() +
  theme(panel.grid   = element_blank(),
        axis.ticks.y = element_blank(),
        axis.text.y  = element_text(hjust = 0))

Forest and density plots code
## forest plot of the intervention effect per case within study (level)
fit1 %>% 
  spread_draws(b_IDIntervention, `r_study:case`[`study:case`, IDIntervention]) %>%
  # add the grand mean to the group-specific deviations
  mutate(mu = b_IDIntervention + `r_study:case`) %>%
  ungroup() %>%
  mutate(`study:case` = str_replace_all(`study:case`, "[_]", " ")) %>% 
  # plot
  ggplot(aes(x = mu, y = reorder(`study:case`, mu))) +
  geom_vline(xintercept = fixef(fit1)[3, 1], color = "gray", linewidth = 1) + #fixed effect of IDIntervention
  geom_vline(xintercept = fixef(fit1)[3, 3:4], color = "black", linetype = 2) +
  stat_halfeye(.width = .95, size = 2/3, fill = "#BA181B", color = "#660708", alpha = 0.6) +
  labs(x = expression(italic("Log Odds Ratio (level)")),
       y = NULL) +
  theme_minimal() +
  theme(panel.grid   = element_blank(),
        axis.ticks.y = element_blank(),
        axis.text.y  = element_text(hjust = 0))

Forest and density plots code
## forest plot of the session effect per case within study (trend)
fit1 %>% 
  spread_draws(b_session, `r_study:case`[`study:case`, session]) %>%
  # add the grand mean to the group-specific deviations
  mutate(mu = b_session + `r_study:case`) %>%
  ungroup() %>%
  mutate(`study:case` = str_replace_all(`study:case`, "[_]", " ")) %>% 
  # plot
  ggplot(aes(x = mu, y = reorder(`study:case`, mu))) +
  geom_vline(xintercept = fixef(fit1)[2, 1], color = "gray", linewidth = 1) + #fixed effect of session
  geom_vline(xintercept = fixef(fit1)[2, 3:4], color = "black", linetype = 2) +
  stat_halfeye(.width = .95, size = 2/3, fill = "#B1A7A6", color = "#660708", alpha = 0.6) +
  labs(x = expression(italic("Log Odds Ratio (trend)")),
       y = NULL) +
  theme_minimal() +
  theme(panel.grid   = element_blank(),
        axis.ticks.y = element_blank(),
        axis.text.y  = element_text(hjust = 0))

Forest and density plots code
fit1 %>% plot()
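The forest plots above report effects on the log-odds-ratio scale. To report odds ratios instead, posterior draws can be exponentiated before summarising; because `exp()` is monotone, quantiles of the exponentiated draws equal the exponentiated quantiles. The draws below are simulated stand-ins for `spread_draws()` output:

```r
set.seed(42)
log_or <- rnorm(4000, mean = 0.8, sd = 0.25)  # hypothetical posterior draws
# median and 95% credible interval on the odds-ratio scale
or_summary <- exp(quantile(log_or, probs = c(0.025, 0.5, 0.975)))
```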

Info

References

Allor, Jill H., Patricia G. Mathes, J. Kyle Roberts, Jennifer P. Cheatham, and Stephanie Al Otaiba. 2014. “Is Scientifically Based Reading Instruction Effective for Students With Below-Average IQs?” Exceptional Children 80 (3): 287–306. https://doi.org/10.1177/0014402914522208.
Bürkner, Paul-Christian. 2017. “Brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. https://doi.org/10.18637/jss.v080.i01.
Enders, Craig K., and Davood Tofighi. 2007. “Centering Predictor Variables in Cross-Sectional Multilevel Models: A New Look at an Old Issue.” Psychological Methods 12 (2): 121–38. https://doi.org/10.1037/1082-989X.12.2.121.
Gelman, Andrew. 2006. “Prior Distributions for Variance Parameters in Hierarchical Models (Comment on Article by Browne and Draper).” Bayesian Analysis 1 (3): 515–34. https://doi.org/10.1214/06-BA117A.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2020. Bayesian Data Analysis. 3rd ed. New York: Chapman & Hall/CRC. https://doi.org/10.1201/b16018.
Gelman, Andrew, Jessica Hwang, and Aki Vehtari. 2014. “Understanding Predictive Information Criteria for Bayesian Models.” Statistics and Computing 24 (6): 997–1016. https://doi.org/10.1007/s11222-013-9416-2.
Grekov, Paulina, James E. Pustejovsky, and David A. Klingbeil. 2025. “Flexible Distributional Models for Meta-Analysis of Reading Fluency Outcomes from Single-Case Designs: An Examination Using Bayesian Methods.” Journal of School Psychology 110 (June): 101429. https://doi.org/10.1016/j.jsp.2025.101429.
Kreft, Ita G. G., Jan de Leeuw, and Leona S. Aiken. 1995. “The Effect of Different Forms of Centering in Hierarchical Linear Models.” Multivariate Behavioral Research 30 (1): 1–21. https://doi.org/10.1207/s15327906mbr3001_1.
Manolov, Rumen, Patrick Onghena, and Wim Van den Noortgate. 2023. “Meta-Analysis of Single-Case Experimental Designs: How Can Alternating Treatments and Changing Criterion Designs Be Included?” Evidence-Based Communication Assessment and Intervention, January. https://www.tandfonline.com/doi/abs/10.1080/17489539.2022.2040164.
McElreath, Richard. 2020. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. 2nd ed. Boca Raton: Chapman & Hall/CRC.
Moeyaert, Mariola, Maaike Ugille, John M. Ferron, Patrick Onghena, Mieke Heyvaert, S. Natasha Beretvas, and Wim Van den Noortgate. 2015. “Estimating Intervention Effects Across Different Types of Single-Subject Experimental Designs: Empirical Illustration.” School Psychology Quarterly 30 (1): 50–63. https://doi.org/10.1037/spq0000068.
Shadish, W. R., E. N. Kyse, and D. M. Rindskopf. 2013. “Analyzing Data From Single-Case Designs Using Multilevel Models: New Applications and Some Agenda Items for Future Research.” Psychological Methods 18 (3): 385–405. https://doi.org/10.1037/a0032964.
Shadish, William R., and Kristynn J. Sullivan. 2011. “Characteristics of Single-Case Designs Used to Assess Intervention Effects in 2008.” Behavior Research Methods 43 (4): 971–80. https://doi.org/10.3758/s13428-011-0111-y.
van Buuren, Stef, and Karin Groothuis-Oudshoorn. 2011. “mice: Multivariate Imputation by Chained Equations in R.” Journal of Statistical Software 45 (3): 1–67. https://doi.org/10.18637/jss.v045.i03.
R version 4.4.2 (2024-10-31)
Platform: x86_64-conda-linux-gnu
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS/LAPACK: /opt/conda/lib/libopenblasp-r0.3.28.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] future_1.40.0    here_1.0.1       patchwork_1.3.0  ggdist_3.3.3    
 [5] bayesplot_1.12.0 tidybayes_3.0.7  mice_3.17.0      mgcv_1.9-3      
 [9] nlme_3.1-167     brms_2.20.4      Rcpp_1.0.14      lubridate_1.9.4 
[13] forcats_1.0.0    stringr_1.5.1    dplyr_1.1.4      purrr_1.0.4     
[17] readr_2.1.5      tidyr_1.3.1      tibble_3.2.1     ggplot2_3.5.2   
[21] tidyverse_2.0.0 

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3   tensorA_0.36.2.1     jsonlite_2.0.0      
  [4] shape_1.4.6.1        magrittr_2.0.3       jomo_2.7-6          
  [7] farver_2.1.2         nloptr_2.2.1         rmarkdown_2.29      
 [10] vctrs_0.6.5          minqa_1.2.8          base64enc_0.1-3     
 [13] htmltools_0.5.8.1    distributional_0.5.0 broom_1.0.8         
 [16] mitml_0.4-5          parallelly_1.43.0    StanHeaders_2.32.10 
 [19] htmlwidgets_1.6.4    extraDistr_1.10.0    plyr_1.8.9          
 [22] zoo_1.8-14           igraph_2.1.4         mime_0.13           
 [25] lifecycle_1.0.4      iterators_1.0.14     pkgconfig_2.0.3     
 [28] colourpicker_1.3.0   Matrix_1.7-2         R6_2.6.1            
 [31] fastmap_1.2.0        rbibutils_2.3        shiny_1.10.0        
 [34] digest_0.6.37        colorspace_2.1-1     ps_1.9.1            
 [37] rprojroot_2.0.4      crosstalk_1.2.1      labeling_0.4.3      
 [40] timechange_0.3.0     abind_1.4-8          compiler_4.4.2      
 [43] bit64_4.6.0-1        withr_3.0.2          backports_1.5.0     
 [46] inline_0.3.21        shinystan_2.6.0      QuickJSR_1.7.0      
 [49] pkgbuild_1.4.7       pan_1.9              MASS_7.3-64         
 [52] gtools_3.9.5         loo_2.8.0            tools_4.4.2         
 [55] httpuv_1.6.16        threejs_0.3.4        nnet_7.3-20         
 [58] glue_1.8.0           callr_3.7.6          promises_1.3.2      
 [61] grid_4.4.2           checkmate_2.3.2      reshape2_1.4.4      
 [64] generics_0.1.3       gtable_0.3.6         tzdb_0.5.0          
 [67] hms_1.1.3            foreach_1.5.2        pillar_1.10.2       
 [70] markdown_2.0         vroom_1.6.5          posterior_1.6.1     
 [73] later_1.4.2          splines_4.4.2        lattice_0.22-6      
 [76] survival_3.8-3       bit_4.6.0            tidyselect_1.2.1    
 [79] miniUI_0.1.2         knitr_1.50           reformulas_0.4.0    
 [82] arrayhelpers_1.1-0   gridExtra_2.3        stats4_4.4.2        
 [85] xfun_0.52            bridgesampling_1.1-2 matrixStats_1.5.0   
 [88] DT_0.33              rstan_2.32.7         stringi_1.8.7       
 [91] yaml_2.3.10          boot_1.3-31          evaluate_1.0.3      
 [94] codetools_0.2-20     cli_3.6.5            RcppParallel_5.1.10 
 [97] rpart_4.1.24         shinythemes_1.2.0    xtable_1.8-4        
[100] Rdpack_2.6.4         processx_3.8.6       globals_0.17.0      
[103] coda_0.19-4.1        svUnit_1.0.6         parallel_4.4.2      
[106] rstantools_2.4.0     dygraphs_1.1.1.6     Brobdingnag_1.2-9   
[109] lme4_1.1-37          listenv_0.9.1        glmnet_4.1-8        
[112] mvtnorm_1.3-3        scales_1.4.0         xts_0.14.1          
[115] crayon_1.5.3         rlang_1.1.6          shinyjs_2.1.0